For nearly 70 years, the Foreign Broadcast Information Service (FBIS) monitored the world's airwaves and
other news outlets, transcribing and translating selected content into English and in the process creating a
multi-million-page historical archive of the global news media. Yet FBIS material has not been widely
utilized in the academic content analysis community, perhaps because relatively little is known about the
scope of the content that is digitally available to researchers in this field. This article, researched and
written by a specialist in the field, contains a brief overview of the service (reestablished as the Open
Source Center in 2004) and a statistical examination of the unclassified FBIS material produced from July
1993 through July 2004, a period during which FBIS produced and distributed CDs of its selected material.
Examined are language preferences, distribution of monitored sources, and topical and geographic emphases.
The author examines the output of a similar service provided by the British Broadcasting Corporation
(BBC), known as the Summary of World Broadcasts (SWB). Its digital files permit the tracing of coverage
trends from January 1979 through December 2008 and invite comparison with FBIS efforts.

Social scientists rely heavily
on archival news sources, but
the selection and archival
practices of these sources
constrain scholarship, especially
on cross-national issues.
Contemporary news aggregators
like LexisNexis focus on
compiling large numbers of news
sources into a single, searchable
archive, but their historical
files are limited. Historical
sources like the ProQuest
Historical Newspapers Database
offer news content back to the
mid-19th century or earlier, but
they include only a few
publications. Both rely nearly
exclusively on English-language
Western news sources.

Global news databases like
NewsBank's Access World News
primarily emphasize
English-language "international"
editions of major foreign
newspapers, which often do not
represent the views of a nation's
vernacular news content. Nor do
these services maintain the
output of foreign broadcast
media, which are especially
critical in regions with low
literacy rates. These limitations
make it difficult, for example,
to examine such questions as how
the international press covered
the 2002 collapse of the American
communications giant WorldCom or
how different regions of the
world dealt with the fallout and
its impact on their domestic
economies. Answering such
questions on a truly
international scale requires
researchers to have the ability
to examine representative samples
of news reports (print,
broadcast, and Internet) in
countries from across the
world. 1

All statements of fact, opinion, or analysis expressed in this article are those of the
authors. Nothing in the article should be construed as asserting or implying US government
endorsement of an article's factual statements and interpretations.

The contents of FBIS and 
SWB collections currently 
available to academic research- 
ers provide the material the 
commercial aggregators do not. 
During the periods studied for
this article (1993-2004 for
FBIS and 1979-2008 for SWB),
the services served as strategic
resources, maintaining relatively
even monitoring volume across the
globe on a broad range of topics,
and they thus provide an ideal
foundation for cross-national
content analysis.

In addition, the focus of the 
two services on broadcast mate- 
rial has offered critical visibil- 
ity into regions where 
broadcasts are the predomi- 
nant form of popular news dis- 
tribution. The ability to select 
material by geographic and top- 
ical emphasis and to access 
English translations of vernac- 
ular content in print, broad- 
cast, and Internet sources has 
made FBIS material, in partic- 
ular, an unparalleled resource 
for content analysis of foreign 
media. 

A Brief Historical Overview 

Since the beginning of World 
War II, the United States and 
Great Britain have operated
the world's most extensive 
media monitoring services. 
Their work eventually became
known as open source intelligence
(OSINT): the collection and
exploitation of noncovert
information sources, including
television and radio broadcasts,
newspapers, trade publications,
Internet Web sites, and nearly
any other form of public
dissemination. The two services
have paid particular attention to
vernacular-language sources aimed
at domestic populations.

In some cases OSINT has 
been used simply to gauge local 
reaction to events. In other 
cases, it has been used to sup- 
port estimates of future events 
or to identify rhetorical pat- 
terns or broadcast schedules to 
support intelligence analysis. 
One of the greatest benefits of 
OSINT over traditional covert 
intelligence has been its nearly 
real-time nature (material 
could be examined very soon 
after it was produced) and the 
relative ease and minimal risk 
of its acquisition and dissemi- 
nation. 

Newswire services like the 
Associated Press collect news 
from around the world, but they 
do so primarily through their 
own reporting staffs or string- 
ers. A protest covered in a 
remote province of China is 
likely to be seen through the 
eyes of a Western-trained 
writer or photographer and
reflect Western views. A domes- 
tic broadcast or newspaper arti- 
cle, on the other hand, reflects 
the perspectives of local popula- 
tions or local authorities, 
depending on the degree of gov- 
ernment control of the media, 
both in its factual reporting and 
the words used to convey that 
information. The global news 
media form a very nonhomoge- 
nous distribution layer and 
news outlets are subject to 
strong cultural and contextual 
influences that may be explored 
through the ways in which they 
cover events. 2 

Known affectionately as 
"America's window on the 
world," 3 FBIS was the back- 
bone of OSINT collection in the 
US Intelligence Community 
(IC), acting as the US govern- 
ment's primary instrument for 
collecting, translating, and dis- 
seminating open-source infor- 
mation. FBIS analysts also 
played primary roles in analyz- 
ing open source information 
and synthesized large amounts 
of material into targeted 
reports. The importance of 
FBIS to the modern intelli- 
gence world was summed up in 
a Washington Times article in 
2001: "so much of what the CIA 
learns is collected from newspa- 
per clippings that the director 
of the agency ought to be called 
the Pastemaster General." 4 

Wartime 

The roots of institutionalized 
OSINT collection in the United 
States can be traced back to the 
Princeton Listening Center 
located in the Princeton University
School of Public and International
Affairs. Funded by the
Rockefeller Foundation, 5 the
center began operations in 
November 1939 with a mission
to "monitor, transcribe, translate,
and analyze shortwave
[propaganda broadcasts] from
Berlin, London, Paris, Rome, 
and, to some extent, Moscow." 6 

A wide range of radio prod- 
ucts was monitored, including 
"news bulletins, weekly topical 
talks, radio news reels, fea- 
tures and dramatizations." Its 
limited staff could only record 
and analyze a sampling of 
broadcasts for propaganda con- 
tent. Topics covered included 
how "propaganda varied 
between countries, as well as 
from one show to another 
within the same country ... the 
way in which specific incidents 
were reported, atrocity refer- 
ences, attitudes toward various 
countries, and the way this pro- 
paganda affected US listeners." 
By April 1941, the listening 
center had compiled over 
15 million words of transcribed 
material from English, Ger- 
man, French, and Italian short- 
wave broadcasts. 

On 26 February 1941, President
Roosevelt established the
Foreign Broadcast Monitoring 
Service (FBMS) with orders to 
monitor foreign shortwave 
radio broadcasts from "belliger- 
ent, occupied, and neutral coun- 
tries" directed at the United 
States. 7 FBMS transcribed 
these broadcasts and used them 
to perform "trend analysis to 
discover shifts in tenor or content
that might imply changes
in Japanese intentions." The 
Princeton Listening Center 
became the core of the new 
agency and by the end of 1942, 
it was translating 500,000 
words a day from 25 broadcast- 
ing stations in 15 languages. 8 
FBMS published its first
transcription report on
18 November 1941, and its very
first analytical report, dated
6 December 1941, contained the
poignant statement:

Japanese radio intensifies
still further its defiant,
hostile tone; in contrast to its
behavior during earlier periods
of Pacific tension, Radio Tokyo
makes no peace appeals. Comment
on the United States is bitter
and increased; it is broadcast
not only to this country, but to
Latin America and Southeastern
Asia. 9

The Cold War 

On 15 August 1945 FBIS 
recorded Emperor Hirohito's 
surrender announcement to the 
Japanese people, and on 
14 December it published its 
final wartime daily report, hav- 
ing proved its utility to intelli- 
gence during the war. With the 
approbation of the Washington 
Post, which called the service 
"one of the most vital units in a 
sound postwar intelligence 
operation," the service was 
transferred to the Central
Intelligence Group of the National
Intelligence Authority,
forerunners of the CIA. 10

Wartime intelligence gather- 
ing required significant 
resources, but they could be 
directed toward a small num- 
ber of countries and sources. 
Intelligence gathering in the 
uncertain post-WW II world 
required sweeping up a wider 
range of international media 
broadcasts— too great a task for 
FBIS to realistically take on by 
itself. Fortunately for Allied 
postwar intelligence, the United 
Kingdom had developed its own 
open source intelligence ser- 
vice, the British Broadcast 
Monitoring Service, just prior to 
the war. From its founding on 
22 August 1939, it produced a 
foreign broadcast compilation 
called the Digest of World 
Broadcasts— renamed the Sum- 
mary of World Broadcasts in 
May 1947. 11 

By 1945, the BBC service was 
monitoring 1.25 million words 
per day in 30 languages, 
although limited resources 
allowed translation into 
English of only 300,000. FBIS, 
on the other hand, transcribed 
and translated the majority of 
the content it monitored. 12 Sub- 
sequently, a British-US agree- 
ment led to a cooperative media 
coverage and sharing arrange- 
ment that has lasted to the 
present day. 13 As a result of the 
agreement, BBC has generally 
focused on Central Asia and
nations that were part of the 
Soviet Union; FBIS has han- 
dled the Far East and Latin 
America, and the two services 
jointly have covered Africa, the 
Middle East, and Europe. The 
agencies also agreed to operate 
under similar "operational and 
editorial standards." 14 

Radio broadcasts and press 
agency transmissions were the 
original focus of FBIS, which 
added television coverage as it 
became more popular. Print 
material became a focus of 
FBIS only in 1967, and by 1992, 
its mission had expanded to 
include commercial and govern- 
mental public-access data- 
bases, and gray literature 
("private or public symposia 
proceedings and academic
studies"). 15 Even though FBIS
did not adopt print material
until 1967, it had access to
substantial news reports because
press agencies usually carried
them on their wirefeeds, which
FBIS monitored nearly from the
beginning. By 1992, the service
had developed a network of 19
regional bureaus, which served as
collection, processing, and
distribution points for their
geographic areas. 16

FBIS and BBC have empha- 
sized historically reliable or 
authoritative sources, but FBIS 
continually adds new sources 
and a "not insignificant amount 
of [its] total effort is spent iden- 
tifying and assessing sources to 
ensure the reliability, accuracy, 



responsiveness, and complete- 
ness of ... coverage." 17 By 1992, 
FBIS was monitoring more 
than 3,500 publications in 55 
languages and 790 hours of 
television a week in 29 lan- 
guages from 50 countries. 18 

The power of OSINT to peer 
into closed societies, to predict 
major events and to offer real- 
time updates cannot be over- 
stated. Its utility in the intelli- 
gence analysis process has been 
the subject of numerous stud- 
ies and the testimony of any 
number of senior intelligence 
officials. Suffice it to say here 
that former Deputy Director of 
Central Intelligence William 
Studeman estimated in a 1992 
speech frequently cited in this 
essay that more than 80 per- 
cent of many intelligence needs 
could be met through open 
sources. 19 By the late 1990s, 
FBIS was serving much more 
than IC needs: a 1997 study 
showed that the Law Library of 
Congress was relying heavily 
on FBIS to provide "quality and 
[timely] information to Con- 
gress about legal, legal-politi- 
cal and legal-economic 
developments abroad." 20 

The "basket of sources" nature 
of OSINT has allowed it to 
leverage the combined report- 
ing power of multiple sources, 
reaching beyond the limita- 
tions of any single source. A 
2006 study examining the use 
of OSINT material for event
identification from news material
found the Summary of World
Broadcasts to be dramatically
superior in volume and breadth to
traditional commercial
newswires. 21 Newswires, although
they have a larger reporting
infrastructure and wider
geographic coverage than
newspapers, still rely on a
single set of reporters to cover
every country. OSINT
compilations like FBIS and 
SWB, on the other hand, 
repackage content from across 
the entire globe, combining the 
viewpoints of multiple outlets 
while maintaining fairly com- 
prehensive coverage of national 
presses. 22 

Having briefly, in 1996, faced 
extinction, FBIS was reborn in 
the wake of the Intelligence 
Reform and Terrorism Prevention
Act of 2004 as the Open
Source Center under a newly 
created Office of the Director of 
National Intelligence. In his 
remarks at a ceremony mark- 
ing OSC's creation, General 
Michael Hayden, then the dep- 
uty director of national intelli- 
gence, noted that OSC "will 
advance the Intelligence Com- 
munity's exploitation of openly 
available information to include 
the Internet, databases, press, 
radio, television, video, geospa- 
tial data, photos and commer- 
cial imagery." 23 

By 2006, OSC reportedly had 
"stepped up data collection and 
analysis to include bloggers 
worldwide and [was] develop- 
ing new methods to gauge the 
reliability of the content." The 
report noted that in order to
expand OSINT efforts, OSC had 
doubled its staff and become a 
clearing house for material 
from 32 different US govern- 
ment OSINT units, and its 
translators turned more than 
30 million words a month into 
English from languages across 
the world. 24 

FBIS as the Public's Open 
Source 

Designed to provide the Allies
an advantage during WW II,
FBIS and its successor have the
added potential to be a critical
resource for academic scholars,
yet the scholarly community's
lack of familiarity with open
source methods, and with the FBIS
collection in particular, has
limited academic use of the FBIS
archive. That archive already
includes some of the material
mentioned in Hayden's speech
(print, broadcast, and
Internet-derived material),
translated into English and
tagged by country and topic. It
is an unparalleled resource for
understanding news content
throughout the world across the
last half-century.

FBIS reports became widely 
available for public use, in print 
and microfiche forms, in 1974, 
when the Commerce Depart- 
ment's National Technical 
Information Service (NTIS) 
began commercial distribution 
of the material. 25 In his 1992 
speech, Admiral Studeman 
indicated a strong appreciation 
of the private-sector and
academic research that had arisen
out of FBIS's availability
outside the US government and
expressed a commitment to its 
continued availability. As he 
noted, "FBIS's customers in 
both the intelligence and policy 
communities ... value the work 
of private-sector scholars and 
analysts who avail themselves 
of our material and contribute 
significantly to the national 
debate on contemporary 
issues." 26 

The following year, 1993, 
FBIS began to distribute CDs of 
its material to Federal Deposi- 
tory Libraries, a practice that 
lasted until June 2004, when 
FBIS began Internet-only dis- 
tribution through Dialog Corpo- 
ration's World News Connection 
(WNC) service
(http://wnc.fedworld.gov/),
which licenses the
material from the US govern- 
ment. This Web-based portal 
offers hourly updates and full 
text keyword searching of FBIS 
material from January 1996 to 
the present. 

The CD collection allows 
greater flexibility in accessing 
reports than the Dialog inter- 
face. Dialog only displays 10 
results at a time and offers lim- 
ited interactive refinement 
capabilities. The inaugural CD 
issued in 1993 covers a period 
of nearly one year, but only a 
small number of reports are 
included for the period
November 1992-June 1993.
July-September 1993 is fully
covered. Thereafter, into June
2004, each distributed CD covered
periods of three months.



The FBIS Dashboard 

The Pulse of Activity 

FBIS collection during the
decade following the end of the
Cold War, as seen in figure 1,
reflects a relatively stable
monthly volume through the end of
1996, after which volume climbed
steadily into early 2001, when it
stabilized again. As noted above,
FBIS faced severe cuts in 1996,
before an outpouring of public
support contributed to its
survival. This graph indicates
that the service not only
survived but found ways (and
resources) to more than double
its monthly output during the
next five years.

The Nature of the Material 

While its primary focus is on 
news material, FBIS also cap- 
tures editorial content and com- 
mentaries, which its monitors 
tag at the beginning of reports. 
Such reports constitute
6.3 percent of the collection:
3.5 percent are flagged as
editorial content and 2.8 percent
as commentaries. Editorial and
commentary content represented
5-6 percent of each year's total
reports through 1999, but
beginning in 2000 the share rose
by nearly 1 percentage point each
year to a peak of just over
9 percent in 2003.

Figure 1: Monthly FBIS Volume, November 1992-June 2004

During this period, FBIS compiled 4,393,121 reports. The monthly distribution of these reports as collected in the CDs is shown in
blue. The low number in the first months reflects the small number of reports transferred to CD at the beginning of the effort. The
magenta points show the number of titles listed in an index of printed FBIS reports prepared under contract by NewsBank, Inc.
NewsBank's index shows a lower volume of reports (about 30 percent less on average per month), possibly because apparent
duplicate reports were not listed. No copy of CD #39 (May/June 2002) could be located, so it could not be included in this
analysis.

Figure 2: Daily FBIS Volume, June 1995-August 1995

Daily reporting volumes, as seen in this three-month snapshot from 1995, indicate that FBIS daily reporting patterns resemble
those of major news aggregators, except that FBIS' lowest volumes occur on Sundays instead of Saturdays. This may reflect FBIS
staffing patterns or other factors in international news activity.

Table 1: Top 25 Source Languages

Origin Language   Report Count   % All Reports
English                2021021           46.00
Russian                 371106            8.45
Arabic                  271326            6.18
Spanish                 197451            4.49
French                  138046            3.14
Serbo-Croatian          135805            3.09
Chinese                 124014            2.82
Persian                  80720            1.84
German                   76688            1.75
Portuguese               66003            1.50
Turkish                  65951            1.50
Hebrew                   51670            1.18
Japanese                 50509            1.15
Korean                   47113            1.07
Albanian                 40898            0.93
Italian                  39060            0.89
Urdu                     31705            0.72
Ukrainian                31608            0.72
Indonesian               29359            0.67
Greek                    28564            0.65
Polish                   28372            0.65
Hungarian                26392            0.60
Slovak                   22980            0.52
Bulgarian                22920            0.52

Table 2: Topics Covered, 1999-2004

Topic                    Report Count   % Reports
Domestic Political            1204515       44.94
International Political       1164586       43.45
Leader                         927241       34.59
Military                       459898       17.16
Domestic Economic              411593       15.36
International Economic         357610       13.34
Terrorism                      277667       10.36
Urgent                         245054        9.14
Human Rights                   196205        7.32
Political                      187128        6.98
Crime                          129829        4.84
International                  128016        4.78
Domestic                       116710        4.35
Dissent                        101710        3.79
Media                           84157        3.14
Energy                          83920        3.13
Technology                      63072        2.35
Proliferation                   63003        2.35
Peacekeeping                    59076        2.20
Environment                     55814        2.08
Economic                        55157        2.06
Health                          49847        1.86
Migration                       40435        1.51
Telecom                         37711        1.41
Narcotics                       35017        1.31
Conflict                        32656        1.22

Table 3: Top 25 Media Outlets

Source                                        Report Count   % All Reports
Beijing XINHUA                                      194316            4.42
Moscow ITAR-TASS                                    155925            3.55
Tokyo KYODO                                         123404            2.81
Seoul YONHAP                                         92722            2.11
Tehran IRNA                                          57857            1.32
Paris AFP                                            56286            1.28
Hong Kong AFP                                        44390            1.01
Prague CTK                                           39201            0.89
Ankara Anatolia                                      31436            0.72
P'yongyang KCNA                                      29824            0.68
Moscow INTERFAX                                      29141            0.66
Belgrade BETA                                        28717            0.65
Belgrade TANJUG                                      28381            0.65
Cairo MENA                                           26764            0.61
Pyongyang KCNA                                       26230            0.60
Zagreb HINA                                          23013            0.52
Taipei Central News Agency WWW-Text                  22983            0.52
Moscow RIA                                           22071            0.50
Tokyo Jiji Press                                     21508            0.49
Moscow Nezavisimaya Gazeta                           20371            0.46
Moscow Agentstvo Voyennykh Novostey WWW-Text         19931            0.45
Jerusalem Qol Yisra'el                               19896            0.45
Madrid EFE                                           18973            0.43
Warsaw PAP                                           17903            0.41

The proportion of excerpted
reports over the study period
was relatively low, averaging
around 5.6 percent per year,
making FBIS material ideal for
content analysis. Longer
broadcast or print reports are
excerpted when only portions of
an item are relevant to targeted
subject areas. For example, a
Radio France International
broadcast might have been
excerpted to transcribe just
those comments about an African
country's denunciation of a trade
embargo against it, or a brief
mention of a party official's
death in a People's Republic of
China radio broadcast might be
extracted from otherwise
unimportant material. 27

Language 

English-language material 
comprises approximately 46
percent of the material FBIS 
collects. Such material repre- 
sents a saving in translation 
expenses and, when coming 
from media controlled by 
authoritarian regimes, poten- 
tially authoritative messages to 
US and other Western govern- 
ments. Table 1 shows the top 25 
source languages for FBIS 
reports during 1992-2004. 
After English, Russian and 
Arabic reports were the most 
frequently collected. 

Topics 

On 1 January 1999, FBIS 
began to include topical cate- 
gory tags in its reports, each of 
which could have as many tags 
as necessary to fully describe 
its contents. As table 2 shows, 
however, political issues topped 
FBIS collection, comprising 
nearly 83 percent of all con- 
tent. Economic issues 
accounted for 26 percent. From 
January to July 1999, reports 
were also categorized separately
as "international" or
"domestic" and "political" or 
"economic." In August 1999 the 
specialized categories "domes- 
tic political," "international 
political," "domestic economic," 
and "international economic" 
were introduced. All other cate- 
gories ran continuously from 
January 1999 until the end of 
this sampling period. 
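
Because each report could carry several tags, the percentages in table 2 are best read as the share of reports bearing each tag, and they can sum to well over 100. The following is a minimal sketch of that calculation in Python. The toy reports and the function name `tag_shares` are hypothetical illustrations, not FBIS data or the method used for this article.

```python
from collections import Counter

# Hypothetical toy reports; each carries one or more topical tags,
# as FBIS reports did from January 1999 onward.
reports = [
    {"Domestic Political", "Leader"},
    {"International Political", "Military"},
    {"Domestic Political", "Domestic Economic"},
    {"Terrorism"},
]


def tag_shares(tagged_reports):
    """Return the percentage of reports that carry each tag."""
    counts = Counter(tag for tags in tagged_reports for tag in tags)
    total = len(tagged_reports)
    return {tag: 100.0 * n / total for tag, n in counts.items()}


shares = tag_shares(reports)
```

In this toy sample, "Domestic Political" appears on half of the reports, and the shares add up to more than 100 percent because multi-tagged reports are counted once per tag.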

Media Outlets 

Content analysts must con- 
sider the volume of material 
produced by each source to 
ensure that no one media out- 
let dominates in their analyses. 
Table 3 lists the top 25 media 
outlets from which FBIS 
selected content during the 
study period from a universe 
exceeding 32,000 sources. 
(Because FBIS citations often
distinguish between Web and
print editions of a source and
between its different editions,
such as international, regional,
local, and weekend editions, the
actual number of unique sources
is probably significantly lower
than the number shown.) In
any case, taken together, selec- 
tions from the top 25 outlets 
accounted for more than a quar- 
ter of all FBIS-selected mate- 
rial during this period. Though 
this small proportion of the 
world's media outlets domi- 
nated FBIS collection, they are 
outlets with national stature 
and international importance. 
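
One simple way to check how heavily a few outlets weigh on a sample is to add up the shares of the largest contributors. The sketch below, in Python, does this with the percentages printed in table 3. The function name `cumulative_share` is a hypothetical helper, and the calculation is offered only as an illustration of the concern raised above.

```python
# Percent of all reports for each outlet listed in table 3, in table order.
TABLE3_SHARES = [
    4.42, 3.55, 2.81, 2.11, 1.32, 1.28, 1.01, 0.89,
    0.72, 0.68, 0.66, 0.65, 0.65, 0.61, 0.60, 0.52,
    0.52, 0.50, 0.49, 0.46, 0.45, 0.45, 0.43, 0.41,
]


def cumulative_share(shares, top_n):
    """Sum the shares of the top_n largest contributors."""
    return sum(sorted(shares, reverse=True)[:top_n])


top4 = cumulative_share(TABLE3_SHARES, 4)
all_listed = cumulative_share(TABLE3_SHARES, len(TABLE3_SHARES))
```

With these figures the four largest outlets account for roughly 12.9 percent of all reports and the listed outlets together for roughly 26.2 percent, which is consistent with the "more than a quarter" noted above.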

The Geography of Coverage 

Understanding the physical 
location of each source is criti- 
cal to exploring possible geo- 
graphic biases in monitoring. 
Unfortunately, while FBIS 
source references do indicate 
the geographic location of 
sources, they do not do so in a 
regular format, so an extensive 
machine geocoding system was 
used to automatically extract 
and compute GIS-compatible 
latitude and longitude coordi- 
nates for each FBIS source. In 
all, coordinates were calculated 
for 97.5 percent of reports and a 
random sample of 100 entries 
checked by hand showed no 
errors. 
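
The geocoding system itself is not described in detail here. As a rough illustration only, the sketch below shows one way a city name at the start of a source reference (as in "Beijing XINHUA" in table 3) could be matched against a gazetteer, and how a coverage percentage could be computed. The gazetteer, its approximate coordinates, and the function names are all hypothetical assumptions, not the system used for this study.

```python
# Hypothetical mini-gazetteer: city name -> approximate (latitude, longitude).
GAZETTEER = {
    "Beijing": (39.90, 116.40),
    "Moscow": (55.75, 37.62),
    "Tokyo": (35.68, 139.69),
    "Hong Kong": (22.32, 114.17),
}


def geocode_source(reference):
    """Return (city, (lat, lon)) for a source reference, or None if no city matches."""
    matches = [city for city in GAZETTEER if reference.startswith(city + " ")]
    if not matches:
        return None
    city = max(matches, key=len)  # prefer the longest matching city name
    return city, GAZETTEER[city]


def coverage(references):
    """Percentage of references for which coordinates were found."""
    hits = sum(1 for ref in references if geocode_source(ref) is not None)
    return 100.0 * hits / len(references)
```

A real system would need a far larger gazetteer and handling for spelling variants such as "P'yongyang" and "Pyongyang", both of which appear in table 3.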

The maps on the following 
pages (figures 3-6) subdivide 
sources by geographic location, 
situating each in its listed city 
of origin. Immediately notice- 
able are the strong similarities 
between the maps, showing 
that FBIS heavily overlapped 
its coverage in each region,
combining broadcast, print, and
Internet sources. This mitigated
the potential biases of any one
distribution format. For example,
in the Arab world, low general
literacy rates mean that
broadcast media form the primary
distribution channel for the
masses and so are subjected to
greater censorship than print
media, which target the elite. 28

After print material was 
added to FBIS collection in 
1967, it became the dominant 
source for FBIS reports, consti- 
tuting just over one-half of 
FBIS sources during the study 
period. (See figure 4.) To deter- 
mine the source type of each 
outlet, the full reference field of 
each report was examined. Any 
reference that contained a time 
stamp (such as 1130 GMT) was
considered a broadcast source, 
while those containing the key- 
words "Internet," "electronic," 
or "www" were flagged as Inter- 
net editions. All remaining 
sources were assumed to be 
print sources. 
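
The classification rule just described can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the code used for the study: the time-stamp pattern (four digits followed by "GMT"), the case-insensitive keyword match, and the choice to test for a time stamp before testing for Internet keywords are all assumptions, and the sample reference strings are hypothetical.

```python
import re

# Assumed time-stamp form: four digits followed by "GMT", e.g. "1130 GMT".
TIME_STAMP = re.compile(r"\b\d{4} GMT\b")
INTERNET_KEYWORDS = ("internet", "electronic", "www")


def classify_source(reference):
    """Label a source reference as 'broadcast', 'internet', or 'print'."""
    if TIME_STAMP.search(reference):
        return "broadcast"  # a time stamp marks a broadcast source
    lowered = reference.lower()
    if any(keyword in lowered for keyword in INTERNET_KEYWORDS):
        return "internet"  # keyword match marks an Internet edition
    return "print"  # everything else is assumed to be print


examples = {
    "Moscow Radio in Russian 1130 GMT 12 Jun 95": classify_source(
        "Moscow Radio in Russian 1130 GMT 12 Jun 95"
    ),
    "Taipei Central News Agency WWW-Text": classify_source(
        "Taipei Central News Agency WWW-Text"
    ),
    "Moscow Nezavisimaya Gazeta": classify_source("Moscow Nezavisimaya Gazeta"),
}
```

Because print is the fallback category, any broadcast or Internet reference that lacks a time stamp or keyword would be counted as print under this rule, which may slightly inflate the print share.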

As table 3 illustrates, some 
sources contributed a much 
larger volume than others, so 
the total number of reports 
gathered from sources of each 
type was also computed. A total 
of 25 percent of reports were 
from print sources, 25 percent 
were from Internet sources, and 
51 percent were from broadcast 
sources. (See figures 5 and 6.) 
Thus, more than half of all
reports during the study period 
were attributed to broadcast 
outlets, in keeping with the 
FBIS broadcast heritage. This 
also makes conceptual sense in 
that broadcast outlets tradition- 
ally operate 24/7, while print 
outlets usually issue only a sin- 
gle edition each day, meaning 
there is far more broadcast 
material to monitor. A smaller 
number of broadcast stations 
transmitting throughout the 
day will be able to generate far 
more content than a large num- 
ber of print outlets with a lim- 
ited amount of page space. 
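
The contrast between a source type's share of monitored sources and its share of issued reports can be expressed as a simple ratio. The Python sketch below uses the print and broadcast percentages given in the text and in the captions to figures 4 and 6. The ratio itself and the name `relative_output` are illustrative assumptions rather than a measure used in the study.

```python
# (percent of monitored sources, percent of issued reports), as reported
# in the text and in the captions to figures 4 and 6.
SHARES = {
    "print": (51.0, 25.0),
    "broadcast": (15.0, 51.0),
}


def relative_output(source_share, report_share):
    """Report share divided by source share; values above 1 mean a source
    type yields more reports than its share of sources would suggest."""
    return report_share / source_share


ratios = {kind: relative_output(*pair) for kind, pair in SHARES.items()}
```

Under these figures the ratio is about 3.4 for broadcast sources and about 0.49 for print sources, which matches the point that a small number of broadcast stations can generate far more content than a large number of print outlets.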

Figure 7 shows the geo- 
graphic distribution, by coun- 
try, of monitored reports. It is 
important to note that devel- 
oped countries (for example, 
France) may act as reporting 
surrogates for less developed
neighbors or for countries in 
which their sources have inter- 
est. The sources in the devel- 
oped countries, of course, also 
have better established media 
distribution networks. Since 
there is no independent, 
authoritative master list of 
media outlets by country that 
covers print, broadcast, and 
Internet sources, there is no 
way of knowing what percent- 
age of the media in each coun- 
try and the total news volume 
they generated was captured by 
FBIS. 

Figure 3: Locations of all sources monitored by FBIS during 1992-2004.
About 83 percent of the shown locations are national capitals.

Figure 4: FBIS broadcast sources (TV, radio, shortwave), 1992-2004.
Broadcast sources constituted 15 percent of the sources FBIS
monitored during the period.

Figure 5: FBIS Internet source locations. These include Internet-only
and Internet editions of print sources monitored during 1992-2004.

Figure 6: Locations of FBIS print sources monitored during
1992-2004. Although print sources constituted 51 percent of
monitored sources, only 25 percent of issued reports were sourced
to print material (see graph on right). Broadcast material still ranked
first as sources for published reports.

In January 1994, FBIS editors
began assigning geographic tags
to their reports. Geographic tags
describe the geographic focus of
a report, not the location of a
report's source. A Chinese
newspaper
article describing events in 
India would have a tag only for 
India and not China, unless 
China played a major role in 
the report's contents. Combin- 
ing the geographic information 
from the source reference with 
the geographic tags makes it 
possible to search for reports 
from one country that describe 
events in another country. 
Despite the potential for bias 
toward activities related to the 
United States, only 12 percent 
of articles published during this 
period actually had geographic 
tags for the United States, 
although the United States is 
the most frequently applied tag. 
(See table 5.) During this
period, Russia was the second 
most frequently tagged country. 
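
Combining the two kinds of geographic information amounts to filtering on a pair of fields: the country of the source and the countries named in a report's tags. The Python sketch below shows the idea with a hypothetical `Report` record and hypothetical toy data. The field and function names are assumptions for illustration and do not reflect the actual FBIS record layout.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    source_country: str  # derived from the source reference
    geo_tags: set = field(default_factory=set)  # editor-assigned geographic tags


def reports_about(reports, source_country, about_country):
    """Reports from sources in one country that are tagged for another."""
    return [
        r for r in reports
        if r.source_country == source_country and about_country in r.geo_tags
    ]


def tag_share(reports, country):
    """Percentage of reports carrying a geographic tag for the given country."""
    return 100.0 * sum(country in r.geo_tags for r in reports) / len(reports)


# Hypothetical toy data, not drawn from the FBIS collection.
sample = [
    Report("China", {"India"}),
    Report("China", {"China", "United States"}),
    Report("Russia", {"India"}),
    Report("France", {"Algeria"}),
]
china_on_india = reports_about(sample, "China", "India")
us_share = tag_share(sample, "United States")
```

The same `tag_share` calculation, applied to the full collection, is the kind of figure behind the 12 percent of reports tagged for the United States.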

A critical question in the study of this material is whether there has
been any systematic bias toward monitoring a greater number of sources
or gathering a greater number of reports in countries deemed to be hot
spots by the United States. Alternatively, FBIS might have gathered
reports uniformly across the world but focused primarily on those about
the United States. Figure 7 shows that China and Russia provided the
most material, more than 20 percent of all reports in the CDs from this
period. Together with the United States, China and Russia account for
more than 30 percent of the geographic focus of all reports. (See
figure 8.) However, Russia and China are also regional superpowers
having significant interaction with their neighbors in the Eastern
Hemisphere and thus are ideally positioned to report on events in that
region.

Table 4: Top 25 countries by number of articles from sources in
that country, 1994-2004

Country                          Report Count   % Reports
Russia                                 478817       10.90
China                                  466682       10.62
Japan                                  216446        4.93
Iran                                   170214        3.87
South Korea                            152083        3.46
France                                 145677        3.32
Serbia & Montenegro                    143009        3.26
United Kingdom                          93609        2.13
Turkey                                  81982        1.87
North Korea                             78845        1.79
India                                   74019        1.68
Belgium                                 69070        1.57
Germany                                 66887        1.52
Israel                                  62254        1.42
Bangladesh                              60994        1.39
Egypt                                   59810        1.36
Bosnia & Herzegovina                    58579        1.33
South Africa                            58273        1.33
Czech Republic                          58209        1.33
Italy                                   54091        1.23
Bulgaria                                47388        1.08
Indonesia                               45243        1.03
Ukraine                                 43632        0.99
Romania                                 43015        0.98
Poland                                  42309        0.96

Table 5: Top 25 countries mentioned in all reporting, 1994-2004

Country                          Report Count   % Reports
United States                          473726       12.20
Russia                                 420446       10.83
China                                  337852        8.70
Japan                                  247578        6.38
Iran                                   193618        4.99
Israel                                 188803        4.86
South Korea                            179029        4.61
Iraq                                   173234        4.46
North Korea                            138658        3.57
India                                  134086        3.45
Pakistan                               131563        3.39
United Kingdom                         124512        3.21
Turkey                                 114355        2.95
West Bank & Gaza Strip                 110829        2.86
Afghanistan                            102059        2.63
France                                  96178        2.48
Germany                                 89763        2.31
Federal Republic of Yugoslavia          82788        2.13
Taiwan                                  82115        2.12
Serbia                                  78144        2.01
Kosovo                                  75940        1.96
Egypt                                   71914        1.85
Bosnia Herzegovina                      70565        1.82
Italy                                   63011        1.62
Indonesia                               58845        1.52

Since reports collected in a given country are not necessarily about
that country, it is useful to compare the percentage of all reports
sourced from a country with the percentage having a geographic topic
tag for that country. Figure 9 shows geographic sources and sinks;
sinks (in blue) are countries about which more reports are collected
from outside their borders than from within their borders. South
America is net neutral overall, with similar volumes of reports being
sourced from each country as are monitored and reported about that
country.
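
The source-versus-sink comparison can be illustrated with a few rows taken from tables 4 and 5. This is a rough sketch, not the method used to draw figure 9: the `tolerance` band for calling a country "neutral" is an assumption introduced here, and the two tables' percentages rest on different report totals, so the differences are only indicative.

```python
# Percent of all reports sourced from each country (table 4) and percent of
# all reports carrying a geographic tag for that country (table 5), 1994-2004.
sourced_pct = {"Russia": 10.90, "China": 10.62, "France": 3.32,
               "India": 1.68, "Israel": 1.42}
tagged_pct = {"Russia": 10.83, "China": 8.70, "France": 2.48,
              "India": 3.45, "Israel": 4.86}


def net_balance(country):
    """Sourced share minus tagged share, in percentage points."""
    return round(sourced_pct[country] - tagged_pct[country], 2)


def classify(country, tolerance=0.25):
    """Label a country a net source, a net sink, or roughly neutral.

    The tolerance (in percentage points) is a hypothetical cutoff chosen
    for illustration; the article does not state one.
    """
    balance = net_balance(country)
    if balance > tolerance:
        return "source"
    if balance < -tolerance:
        return "sink"
    return "neutral"


for name in sourced_pct:
    print(name, net_balance(name), classify(name))
```

Under these assumptions France comes out as a net source and Israel and India as net sinks.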

Africa as a whole is a net sink, with many more reports produced about
that continent than are sourced from it. This results from both
relatively underdeveloped media distribution networks and greater
barriers to collection of material from African locations. This reality
presents significant challenges to analysts, who must deal with content
about these nations that is collected from outside their borders and
subject to foreign, rather than domestic, views on internal events. By
contrast, France is a net source, largely because of the presence of
the Agence France Presse (AFP) wire service. Similarly, the BETA and
TANJUG news agencies in Belgrade contributed to Serbia's ranking as a
net source during this period.

Figure 9: FBIS source (orange to red) and "Sink" (gray to blue) countries, 1994-2004.

The coverage statistics do not 
appear to indicate that FBIS 
appreciably favored regions in 
which the United States was 
actively engaged during 1994- 
2004. The figures reflect a fairly 
even coverage outside Russia 
and China without redirecting 
resources toward more prob- 



lematic regions. This suggests 
that FBIS provided a strategic 
service, monitoring all regions 
of the world relatively evenly 
rather than a tactical resource 
focused on troublesome areas. 
This is a critical attribute for 
using this material in content 
analysis. 

❖ ❖ ❖ 



BBC Summary of World 
Broadcasts 

Whereas public access to his- 
torical digital FBIS content 
only began in July 1993, and 
public access to content after 
2004 is limited by the technical 
constraints of the Dialog search 
interface, material from the 
SWB service has been avail- 
able since 1 January 1979 
through LexisNexis. Like FBIS, 
SWB today monitors media 
from 150 countries in more 
than 100 languages from over 
3,000 sources. It has overseas 
bureaus in Azerbaijan, Egypt, 
India, Kenya, Russia, Ukraine, 
and Uzbekistan and a staff of 
around 500. 29 It has a wide corporate
following, first appearing
in the Reuters Business
Briefing newswire in 1983, and 
in 2001 was one of the 10 most 
popular news sources in that 
service. 30 

SWB's mission is to focus on
"political, economic, security,
and media news, comment, and
reaction." The service acknowledges
geographic prioritization:
Iraq and Afghanistan are "pri- 
ority one countries," and the 
volume of coverage of Pakistani 
media has more than tripled 
since 2003 as greater monitor- 
ing resources were brought to 
bear on that region. 31 

Unlike FBIS, whose budget fell 
under the secrecy guidelines of 



the intelligence community that 
housed it, BBC publishes basic 
annual financial figures, offer- 
ing some insights into the scope 
of its operations. During 
2008/2009, its total budget was 
approximately £28.7 million 
($45.9 million), of which £24.6 
million came from the British 
government, £1.4 million from 
commercial licensing, and £2.6 
million from lessees, interest,
and income from the Open 
Source Center. Expenditures 
included £15.1 million for staff, 
£3.6 million for "accommoda- 
tion, services, communications, 
maintenance, and IT," £479,000 
for copyright clearances, £3.8 
million for "other" and £3 mil- 
lion for depreciation. 32 The gov- 
ernmental portion of its funding 



for 1994/95 was approximately 
£18.4 million ($28.7 million), 
suggesting generally stable lev- 
els of governmental support over 
the past decade and a half. 33 

Editorial Process 

FBIS and SWB are renowned for the extremely high quality of their
translations, which often capture the tone and nuance of the original
vernacular. Such translation quality requires a high level of editorial
input, including iterative revision processes in both services. Changes
in translation, however, manifest themselves in ways that complicate
content analysis of the FBIS and SWB databases.

Figure 11: Distribution of BBC Summary of World Broadcast Reports by Country, 1994-2004.

In FBIS it is possible that an editor or a downstream consumer might
inquire about aspects of a given translation for clarification or
amplification and prompt a retranslation. This is especially prevalent
with broadcast transmissions, which can suffer from interference that
makes passages unclear.

But FBIS methods for accounting for such changes were inconsistent. An
FBIS translation or transcription that was substantially changed might
have been reissued to the wire. In some cases a notation was provided,
such as a 1998 FBIS report drawn from Radio France Internationale that
noted at the beginning: "Corrected version of item originally filed as
ab0909100698; 34 editorial notes within body of item explained changes
made." The corrected report was assigned its own unique FBIS ID,
AB0909113898, and since no structured field existed in the database
system on the CDs to connect the two reports, an analyst would have to
read the note in order to recognize that the two reports are the same
item. Researchers conducting automated queries, such as a time-series
analysis, would find this item double counted.



Unfortunately, acknowledgement of revisions in both collections is the
exception rather than the norm. The FBIS reports studied show
duplication of about 1 to 2 percent per day. In some cases, only the
title changes, or a duplicate report may simply have been an error,
such as a 5,530-word report from 2001 that was reissued later the same
day without the last 731 words. 35 In another case, a 1 January 2001
article about NATO changed "Foreign Minister" Colin Powell to
"Secretary of State," and the fate of the "enlargement" of the North
Atlantic Alliance became simply the fate of the "Alliance itself." 36 A
sentence was also moved down in the first paragraph, together with
several other smaller changes, altering nearly 10 percent of the total
text. In both cases, the duplicate reports had their own unique
identifiers but contained no information linking them to their
originals.

For the entire period 1979- 
2008, the Lexis SWB archive 
contains 4,694,122 reports (dis- 
counting separate summary 
reports of fuller accounts). 
Analysis of the reports showed 
that nearly 1 million of these 
reports were duplicates. 

SWB content accessed through Lexis for the years 1998-2002 showcases
this revision process and underscores the challenges for content
analysts. Curiously, the explanation for the duplication differs
between two periods within these five years. The easier period to
explain runs from March 2001 through December 2002, when nearly half of
all reporting was duplicated. Duplicates during this period are in most
instances identical copies of earlier reports, with the exception of
some extraneous formatting characters. Simple textual comparison of all
reports issued on each day identified the duplicates. This accounted
for about 700,000 duplicates.
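
A minimal sketch of this kind of same-day textual comparison follows. It assumes each report is available as an (id, date, text) tuple and that "extraneous formatting characters" means non-printable characters and irregular whitespace; neither the record layout nor the exact normalization used in the study is stated in the article.

```python
import hashlib
import re


def normalize(text):
    """Drop non-printable formatting characters and collapse whitespace."""
    kept = "".join(ch for ch in text if ch.isprintable() or ch.isspace())
    return re.sub(r"\s+", " ", kept).strip()


def find_exact_duplicates(reports):
    """Return ids of reports whose normalized text repeats an earlier report
    issued on the same day.

    `reports` is an iterable of (report_id, date, text) tuples, a
    hypothetical layout chosen for this illustration.
    """
    seen = {}
    duplicates = []
    for report_id, date, text in reports:
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        key = (date, digest)
        if key in seen:
            duplicates.append(report_id)
        else:
            seen[key] = report_id
    return duplicates


# Invented sample: "b" differs from "a" only by stray formatting characters.
sample_reports = [
    ("a", "2001-03-05", "NATO talks\x00 resume  today"),
    ("b", "2001-03-05", "NATO talks resume today"),
    ("c", "2001-03-06", "NATO talks resume today"),
]
exact_dupes = find_exact_duplicates(sample_reports)
print(exact_dupes)
```

Hashing the normalized text keeps the comparison linear in the number of reports per day, which matters at the scale of several million reports.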

The remaining duplicates, which run from January 1998 through March
2001, present a much more significant analytical challenge. The
duplicates during this period are not identical copies. They are
retranslations of earlier reports. Some only have changes in titles,
for example, "inaugurated" becoming "set up" or "Montenegrin outgoing
president" changing to "outgoing president." 34 However, most include
changes to the body text itself, such as a 24 January 1998 Romanian
Radio broadcast that first appeared in Lexis on the 25th, with a
revised edition issued the following day. 35 Seven changes were made to
the body text, including "make" changed to "do" and "make the reform"
becoming "carry out reforms." Several words were changed from singular
to plural or vice versa, while monitor's comments were inserted to
indicate the speaker for different passages. In all, nearly 4 percent
of the report's total text was changed.

Linking articles containing multiple substantive changes of this kind
is a non-trivial task: sentence order may be revised, words changed,
and phrases added or deleted. Simple textual comparison will not
suffice, and more advanced detection tools are required. Titles can
also change. Unfortunately, SWB uses the same timestamp in the source
citations of all reports from the same broadcast, meaning that header
fields do not provide information to help distinguish duplicates.
Instead, full-text document clustering is required, a technique that
computes overlap in word usage between every pair of documents for a
given day. If two documents overlap by 90 percent or more, they are
considered duplicates.
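
The clustering step can be approximated as follows. This is a sketch under explicit assumptions rather than the procedure actually used: the article does not define its overlap measure, so the code below takes the number of shared words (counting repeats) as a fraction of the longer document, and it groups linked reports with a simple union-find. The function names and sample texts are invented for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def word_counts(text):
    """Lowercased bag of words for one report."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def overlap(a, b):
    """Shared words (with multiplicity) as a fraction of the longer document.

    This particular measure is an assumption made for the sketch.
    """
    shared = sum((a & b).values())
    longest = max(sum(a.values()), sum(b.values()))
    return shared / longest if longest else 0.0


def cluster_day(reports, threshold=0.90):
    """Group one day's reports (dict of id -> text) into duplicate clusters."""
    counts = {rid: word_counts(text) for rid, text in reports.items()}
    parent = {rid: rid for rid in reports}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare every pair of documents issued on the same day.
    for a, b in combinations(reports, 2):
        if overlap(counts[a], counts[b]) >= threshold:
            parent[find(b)] = find(a)

    clusters = {}
    for rid in reports:
        clusters.setdefault(find(rid), set()).add(rid)
    return [members for members in clusters.values() if len(members) > 1]


# Invented sample: "b" is a light retranslation of "a" (one word changed).
day = {
    "a": "the government will make the reform of the tax system this year "
         "and the parliament will vote on it soon",
    "b": "the government will do the reform of the tax system this year "
         "and the parliament will vote on it soon",
    "c": "rice prices rose sharply in the northern provinces",
}
print(cluster_day(day))
```

Because a bag-of-words measure ignores word order, it tolerates moved sentences as well as small substitutions. The pairwise comparison is quadratic in the number of reports per day, which is workable for daily batches but would not scale to comparing the whole archive at once.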

Such an approach allows for 
fully automated detection and 
removal of duplicates, with 
extremely high accuracy (a ran- 



dom sample of days checked, for 
example, revealed no false posi- 
tives). In all, the 38 months of 
this period exhibit an average 
of 42-percent duplication, with 
a high of nearly 65 percent in 
January 2001. With clustered 
duplicates removed, a total of 
3,700,761 unique reports 
remain from the original nearly 
4.7 million reports. 
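
As a quick check on the figures quoted above, the overall duplication rate follows directly from the two report totals. The variable names are illustrative; only the two counts come from the text.

```python
total_reports = 4_694_122    # Lexis SWB archive, 1979-2008
unique_reports = 3_700_761   # remaining after duplicates are removed

duplicates = total_reports - unique_reports
duplicate_share = duplicates / total_reports

print(duplicates)                  # 993361, i.e. "nearly 1 million"
print(f"{duplicate_share:.1%}")    # 21.2%
```

Since roughly 700,000 of these were exact copies from March 2001 through December 2002, the remaining duplicates (a little under 300,000) are those found by clustering in the January 1998 through March 2001 period.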

Even this approach can only identify reports with relatively minor
alterations. Wholesale rewrites, which keep the factual information the
same but substantially or completely alter the wording, cannot readily
be detected through purely automated means. For example, a January 1998
report about rice prices was initially released containing numerous
monitor comments indicating unclear transcription. The 93-word
transcript was rereleased nine days later as a 50-word paraphrased
edition. 36 A 303-word transcript the same month concerning enactment
of a tax law in Russia was rereleased six days later, cut nearly in
half, again with heavy paraphrasing and rewriting. 37 In both cases the
"Text of Report" header denoting a full-text transcript was removed
from the subsequent report, suggesting an explicit decision on the part
of the monitoring staff to switch from a literal translation to a
paraphrased summary. A manual review of content during this period
suggests that this activity may be restricted to broadcast content,
which presents the greatest challenges for accurate transcription.

Figure 12: SWB and FBIS Source Locations, 1994-2004.



SWB and FBIS Coverage 
Compared 

FBIS and SWB had a long history of sharing content. The maps on this
and the next page (figures 12 and 13) show the similarity of the two
services' geographic emphases. (Their Pearson correlation is r=0.84
[N=191], suggesting very strong overlap.)
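
For readers who want to reproduce this kind of comparison, a Pearson correlation over per-country coverage shares can be computed as below. This is a generic sketch: the sample shares are invented, and the article does not say exactly which per-country quantities were correlated to obtain r=0.84.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-country shares of reports (percent) for the two services,
# listed in the same country order. Real use would have one pair per country.
fbis_share = [10.9, 10.6, 4.9, 3.9, 3.5]
swb_share = [12.1, 7.4, 2.0, 4.4, 1.8]
print(round(pearson_r(fbis_share, swb_share), 2))
```

With N=191 the two lists would each hold one value per country, aligned so that position i refers to the same country in both.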

Unfortunately, source refer- 
ences are constructed very dif- 
ferently in the two collections, 
so it is only possible to compare 
source listings geographically. 
Figure 12 locates all SWB and 
FBIS sources during this 
period. To simplify the map ren- 
dering, if SWB and FBIS both 
have a source at a given loca- 
tion, the FBIS map point may 
be obscured by the SWB point. 



The data show that FBIS draws 
from a larger selection of 
sources in a broader geo- 
graphic range than does SWB. 

Unlike FBIS, SWB draws some content from sources based in the United
States (primarily US sources aimed at foreign audiences), but those
account for only a small fraction of its content and are not shown
here. FBIS is a much higher-volume service, generating an average
monthly volume just over two and a half times that of SWB from 1993 to
2004, which may also account for the larger number of sources.

Shifting Coverage Trends 

Because SWB content is avail- 
able in digital format back to 1 
January 1979, it is possible to 
analyze a 30-year span to trace 



the evolution of geographic cov- 
erage of monitored material. 

As shown in figure 13, which illustrates the total change in coverage
density from 1979 to 2008, relatively large increases have taken place
in coverage of Iran and Pakistan; little change can be seen in other
Middle Eastern nations, notwithstanding increased Western military
presences in Iraq and Afghanistan; and declines have occurred in
coverage of Russia and China, where the decline has been the most
pronounced. If SWB coverage can, indeed, be used to infer levels of US
coverage of open sources today, these data support the argument that
open source resources are not, by and large, retasked to military
conflict zones and provide instead a strategic resource.

Figure 13: SWB Coverage Density Change, 1979-2008.



Figures 14A and 14B show coverage shifts in five-year increments during
this period. (Western Hemisphere countries are not shown because there
was relatively little change in the period.) These graphs further
highlight the evenness of SWB coverage throughout the world and the
sustained emphasis on Russia and China, mirroring FBIS's focus on these
two countries. The impact on analysis of such stable sourcing cannot be
overstated. While countless studies examine the geographic biases in
Western reporting of international events, SWB appears to be largely
immune to such selection biases, with African and Latin American
countries receiving nearly the same attention as their European
counterparts.

The relatively intense cover- 
age of Russia and China, how- 
ever, is more troubling for those 
seeking to do broad-based 
research. All six maps use the 
same color scale, showing that 
Russian emphasis has 
remained nearly constant for 
three decades. Emphasis on 
China, on the other hand, has 
decreased nearly linearly over 
this period. 

Increases in coverage of some areas evident in these maps (Greece,
Poland, and India, for example) track with heightened security concerns
during the periods.

Conclusion 

Notwithstanding recent criti- 
cism of US neglect of open 
source intelligence, the record of 
US and British collection of such 
intelligence evident in publicly 
available collections reflects a 
longstanding US and British 



understanding of the impor- 
tance of realtime, uniform moni- 
toring of the media output of 
nations around the globe. 

For the academic researcher, 
the two services in effect act as 
time machines, allowing social 
and political scientists, histori- 
ans, and others to turn back the 
clock to revisit events in inno- 
vative ways. While the goals of 
intelligence analysts using 
OSINT are different from those 
of academic researchers, their 
needs and methodologies are 
similar. On the academic side, 
content analysts of interna- 
tional events have historically 
been limited by the constraints 
of commercial news databases 
dominated by Western media. 
With increasing globalization of 
so many social, economic, and 
political phenomena, scholars 
will have to abandon reliance 
on Western newspapers and 
look elsewhere. 



The ability of US and British 
OSINT services to penetrate into 
the non-Western world will 
make their products central to 
the next wave of social science 
research. They operate as an 
almost ideal strategic monitor- 
ing resource, with nearly even 
coverage across the globe, and 
offer a unique view into the 
broadcast news media that dom- 
inate many regions of the world. 
Their political and economic 
focus and full-text English trans- 
lations make them a powerful 
resource for international news 
studies. As the world grows 
smaller, OSINT offers academic 
scholars an unparalleled comple- 
ment to existing commercial 
databases and provides a unique 
opportunity for academia and 
government to collaborate in fur- 
thering our understanding of the 
global news media and the 
insights it can provide into the 
functioning of societies. 



Figure 14B: SWB Coverage Trend, 1994-2008 
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